AITopics | interactive application


interactive application

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

SpeakStream: Streaming Text-to-Speech with Interleaved Data

Bai, Richard He, Gu, Zijin, Likhomanenko, Tatiana, Jaitly, Navdeep

arXiv.org Artificial Intelligence, May-27-2025

The latency bottleneck of traditional text-to-speech (TTS) systems fundamentally hinders the potential of streaming large language models (LLMs) in conversational AI. These TTS systems, typically trained and run on complete utterances, introduce unacceptable delays when coupled with streaming LLM outputs, even with optimized inference speeds. This is particularly problematic for creating responsive conversational agents, where low first-token latency is critical. In this paper, we present SpeakStream, a streaming TTS system that generates audio incrementally from streaming text using a decoder-only architecture. SpeakStream is trained using a next-step prediction loss on interleaved text-speech data. During inference, it generates speech incrementally while absorbing streaming input text, making it particularly suitable for cascaded conversational AI agents where an LLM streams text to a TTS system. Our experiments demonstrate that SpeakStream achieves state-of-the-art first-token latency while maintaining the quality of non-streaming TTS systems. Our demo website is available at https://apple.github.io/speakstream-demo.

Index Terms: text-to-speech, speech synthesis, streaming

Recent years have witnessed a surge of interest in speech interfaces for large language models (LLMs).
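The abstract above mentions training with a next-step prediction loss on interleaved text-speech data. The following is only a minimal sketch of what such interleaving could look like, not the authors' implementation: the chunk boundaries, the integer speech token IDs, and the helper names `interleave` and `next_step_pairs` are all hypothetical, and the abstract does not say how text and speech are aligned or which positions contribute to the loss.

```python
from typing import List, Sequence, Tuple, Union

Token = Tuple[str, Union[str, int]]  # (modality, value); an assumed representation

TEXT, SPEECH = "text", "speech"


def interleave(
    aligned_chunks: Sequence[Tuple[Sequence[str], Sequence[int]]]
) -> List[Token]:
    """Flatten aligned (text_tokens, speech_tokens) chunks into one sequence.

    Each chunk's text tokens are emitted first, followed by the speech tokens
    assumed to correspond to that text.
    """
    stream: List[Token] = []
    for text_tokens, speech_tokens in aligned_chunks:
        stream.extend((TEXT, t) for t in text_tokens)
        stream.extend((SPEECH, s) for s in speech_tokens)
    return stream


def next_step_pairs(stream: Sequence[Token]) -> List[Tuple[Token, Token]]:
    """Pair every position with its successor, as a next-step objective would."""
    return list(zip(stream[:-1], stream[1:]))


# Hypothetical alignment: two one-word chunks with made-up speech token IDs.
chunks = [(["hello"], [101, 102]), (["world"], [103])]
stream = interleave(chunks)
pairs = next_step_pairs(stream)
```

Under this layout a decoder-only model sees each text chunk before the speech it has to produce, which is one way a system could start emitting audio before the full utterance has arrived.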

large language model, machine learning, natural language

arXiv.org Artificial Intelligence

2505.19206

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Synthesis (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)


DipMe: Haptic Recognition of Granular Media for Tangible Interactive Applications

Wang, Xinkai, Zhang, Shuo, Zhao, Ziyi, Zhu, Lifeng, Song, Aiguo

arXiv.org Artificial Intelligence, Nov-13-2024

While tangible user interfaces have shown their power in naturally interacting with rigid or soft objects, users cannot conveniently use different types of granular materials as the interaction media. We introduce DipMe, a smart device that recognizes the types of granular media in real time and can be used to connect granular materials in the physical world with various virtual content. Unlike vision-based solutions, we propose a dip operation of our device and exploit the haptic signals to recognize different types of granular materials. With modern machine learning tools, we find that the haptic signals from different granular media are distinguishable by DipMe. With online granular object recognition, we build several tangible interactive applications, demonstrating the effectiveness of DipMe in perceiving granular materials and its potential for developing tangible user interfaces with granular objects as the new media.

dipme, granular media, recognition

arXiv.org Artificial Intelligence

2411.08641

Country:

North America > United States > New York > New York County > New York City (0.05)
Africa > Tanzania > Zanzibar (0.04)
Africa > Tanzania > Mjini Magharibi Region > Zanzibar (0.04)
(2 more...)

Genre: Research Report (0.64)

Industry:

Energy (0.68)
Media (0.48)

Technology:

Information Technology > Human Computer Interaction > Interfaces (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)


Build a Named Entity Recognition App with Streamlit

#artificialintelligence, Aug-31-2022, 20:30:34 GMT

In my previous article, we fine-tuned a Named Entity Recognition (NER) model trained on the wnut_17[1] dataset. In this article, we show step by step how to integrate this model with Streamlit and deploy it using Hugging Face Spaces. The goal of this app is to tag input sentences in real time at the user's request. Keep in mind that, unlike deploying a trivial ML model, deploying a large language model on Streamlit is tricky. We address those challenges as well.
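Tagging an input sentence usually involves turning per-token model predictions into entity spans that an app can display. The sketch below assumes the model emits BIO-style tags (`B-`, `I-`, `O`), which is a common convention but is not confirmed by the summary above; the function name `bio_to_spans` and the example tokens and labels are illustrative only, and the Streamlit and Hugging Face wiring is deliberately left out.

```python
from typing import List, Optional, Sequence, Tuple


def bio_to_spans(tokens: Sequence[str], tags: Sequence[str]) -> List[Tuple[str, str]]:
    """Group BIO-tagged tokens into (entity_text, label) spans."""
    spans: List[Tuple[str, str]] = []
    current_tokens: List[str] = []
    current_label: Optional[str] = None

    def flush() -> None:
        # Emit the span collected so far, if any.
        if current_tokens and current_label is not None:
            spans.append((" ".join(current_tokens), current_label))

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-") or (tag.startswith("I-") and tag[2:] != current_label):
            # A new entity starts, or an I- tag appears without a matching open span.
            flush()
            current_tokens, current_label = [token], tag[2:]
        elif tag.startswith("I-"):
            # Continuation of the currently open entity.
            current_tokens.append(token)
        else:
            # "O" (outside) closes any open entity.
            flush()
            current_tokens, current_label = [], None
    flush()
    return spans


# Illustrative input; the tags are what a hypothetical model might return.
tokens = ["Empire", "State", "Building", "is", "in", "New", "York"]
tags = ["B-location", "I-location", "I-location", "O", "O", "B-location", "I-location"]
entities = bio_to_spans(tokens, tags)
```

In an app, the resulting spans could then be rendered next to the input sentence each time the user submits text.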

data science project, entity recognition app, streamlit

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)


Beating Common Sense into Interactive Applications

Lieberman, Henry, Liu, Hugo, Singh, Push, Barry, Barbara

AI Magazine, Dec-15-2004

A long-standing dream of artificial intelligence has been to put commonsense knowledge into computers, enabling machines to reason about everyday life. However, it is widely assumed that the use of common sense in interactive applications will remain impractical for years, until collections of commonsense knowledge can be considered sufficiently complete and commonsense reasoning sufficiently robust. Recently, at the Massachusetts Institute of Technology's Media Laboratory, we have had some success in applying commonsense knowledge in a number of intelligent interface agents, despite the admittedly spotty coverage and unreliable inference of today's commonsense knowledge systems.

artificial intelligence, interactive application, management and information

AI Magazine

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Commonsense Reasoning (1.00)
